Article Info

Article history:
Received Aug 18, 2018
Revised Jul 6, 2019

Keywords:
Intelligent systems
Machine learning
Multi-layer perceptron
Pattern classifier
Thyroid disease

ABSTRACT

A challenging task in modern research is to diagnose diseases accurately prior to their treatment.
Particularly in rural areas, instant diagnosis of a lifestyle disease is rarely available, so it becomes
necessary to use modern computing techniques to design intelligent prediction systems.
[...] prediction problems in different fields such as medical diagnosis, decision support systems,
biochemical analysis, image processing and financial analysis. The accuracy of a thyroid diagnosis
system may be improved by considering a few additional attributes such as heredity, age and antibodies.
In this paper, an improved and intelligent thyroid disease prediction system is developed using a
multilayer perceptron (MLP) machine learning model. The proposed system uses 7 to 11 features of the
individuals to classify them into normal, hyperthyroid and hypothyroid classes. The system uses the
gradient descent backpropagation algorithm to train the machine learning model on a dataset of 120
subjects collected from SKIMS Hospital, Jammu and Kashmir. The thyroid prediction system achieves an
overall accuracy of nearly 99.8% with 11 attributes when a larger number of training instances is used.
However, with only 30 subjects the accuracy drops to 66.7% using 11 attributes and 70% using 7
attributes.

1. INTRODUCTION 

Machine learning is a modern way of computing in which knowledge, along with a technique, is used to 
build a model that imitates human behaviour. Once the machine learning model is trained, it can 
predict the class of a given feature set. As shown in Figure 1, a variety of machine learning 
techniques are available, which may be broadly categorised into supervised, unsupervised and reinforcement 
learning. Typical examples of supervised machine learning algorithms include nearest neighbour 
classification, regression, support vector machines (SVM), artificial neural networks, naive Bayes classifiers 
and decision trees. An artificial neural network (ANN) is an information processing paradigm that is 
motivated by the way a biological neural system, i.e. the brain, processes data. A neural network consists of 
a large number of interconnected information processing components called neurons. The key component of the 
neural network is its novel structure. Neural systems, with their capability to derive meaningful information 
from imprecise data, can be used to extract and distinguish patterns that are too intricate to be 
noticed by humans or by other computer techniques. As an ANN is a self-learning framework, it supports 
distinct classes of learning algorithms, namely supervised learning, unsupervised learning and reinforcement 
learning. ANNs are widely used in real-world computing applications. The various areas of application 
include pattern recognition, pattern classification and pattern prediction. The whole paradigm of predicting 
lifestyle disease is shifting from conventional methods to machine learning based prediction systems. 
Figure 1. Machine learning taxonomy 


Thyroid disease is one of the common lifestyle diseases. The thyroid gland is a butterfly-shaped 
organ located in the neck, below the mouth. It releases hormones that control 
metabolic functions such as heart rate and body temperature. It produces two main hormones, T3 and T4. 
These hormones regulate various metabolic activities, such as body weight and heart rate, and these 
activities may be disturbed if the hormone levels change. The diagnosis of thyroid disease is therefore 
important prior to its treatment. About 32 percent of the total Indian population suffers from thyroid disease. 
Thyroid disease may be broadly categorized into two types: hypothyroidism and hyperthyroidism. When the 
amount of hormone exceeds the amount required by the human body, it causes hyperthyroidism. Hypothyroidism is 
the inverse of hyperthyroidism: the hormone level falls short of the requirement, which reduces body 
metabolism and causes drowsiness and pain in the joints. 
Figure 2. Mechanism of thyroid disease 


The rest of the article is organised as follows. Section 2 presents a brief 
background of various related lifestyle disease prediction systems. Section 3 explains the machine 
learning based framework and algorithm of the proposed intelligent thyroid prediction system. The training 
and prediction accuracy of the proposed thyroid system at various levels is computed in Section 4. 
Section 5 summarises the findings and the future scope of the presented research. 


2. RELATED WORK 
Various researchers have used different pattern classifiers to develop lifestyle disease prediction 
systems. This section presents a brief survey of thyroid and other disease prediction systems. 
I. D. Maysanjaya et al. (2015) used six different methods for the diagnosis of thyroid disease. 
After experimentation, the performance of the multilayer perceptron was found to be the highest of the 
six methods [1]. Mohd. Reza et al. (2017) discussed the diagnosis of different types 
of thyroid disease using ANN by considering the age of an individual. The input to the thyroid prediction 
system is seven inputs (hormone tests and age) and the output is the thyroid diagnosis. The ANN 
structures used include MLP, PNN, GRNN and CFNN [2]. Shivaneepanday et al. (2016) applied 
various data mining techniques such as Bayes net, multilayer perceptron, RBF network, C4.5, CART, 
REP tree and decision stump to develop classifiers for the diagnosis of hypothyroid disease [20]. Their 
experiments show that the REP tree and C4.5 techniques perform well compared to the others [3]. 
Mazin Abdul Rasool Hameed et al. (2009) proposed a method of classifying thyroid disease using a 
multilayer feed-forward network with the backpropagation learning rule. In this work three inputs were 
considered: T3, T4 and TSH [4]. Saeed Shariati and Mahdi Motanali Haghighi (2010) used a fuzzy system to 
diagnose hepatitis and thyroid disease. The results of fuzzy neural networks were compared with those of 
support vector machines and artificial neural networks [5]. Anupam Shukla et al. (2009) trained the system 
using three ANN algorithms: backpropagation (BPA), radial basis function (RBF) and learning vector 
quantization (LVQ) [6]. Narender Kumar et al. (2017) used various data classification techniques and 
compared their accuracy in predicting chronic kidney disease [7]. Xing et al. (2017) developed a 
data mining approach to predict the survival of coronary heart disease (CHD) patients. In this work, 
three algorithms were used to develop the prediction models [8]. Hsiang et al. (2006) compared expert 
judgment (knowledge-based) and automatic (data-driven) approaches to feature selection for extracting 
features from a given data set. The results suggest that the automatic feature selection approach 
improves the prediction capability of a classifier, whereas the domain expert improves the sensitivity of a 
classifier [9]. Rajeebdev et al. (2008) formulated the diagnosis of diabetes as a binary classification 
problem: a person suffering from diabetes falls in class 1 and a non-diabetic person falls in class 2. They 
used the backpropagation algorithm in a multilayer feed-forward network, with both single-layer and 
multilayer perceptrons. Both neural networks have six input nodes and one output node. The network 
classified patients into diabetic and non-diabetic with a performance of 92.50% [10]. Canan et al. 
(2009) proposed a hybrid structure of neural network and fuzzy logic. The experiments show that the hybrid 
schemes give better results than the non-hybrid structures [11]. Shradha Deshmukh et al. (2017) proposed 
two classification algorithms, namely fuzzy min-max and pruning fuzzy min-max [12]. 
K Vishwanant et al. (2014) proposed a multilayer perceptron with backpropagation to distinguish the 
type of the stone. The multilayer perceptron with backpropagation gives a high accuracy of 98% 
compared with Naive Bayes [13]. Muthuselvan et al. (2016) implemented five different 
data mining techniques using the data mining tool WEKA in order to predict breast cancer from blood 
data sets. The five algorithms are Naive Bayes, OneR, ZeroR, random tree and J48. A comparison of the 
algorithms shows that J48 performed best, at 86.36%, while ZeroR performed worst, at 56.81% [14]. 
Madhuri et al. (2013) proposed a computer-aided artificial intelligence system for the diagnosis of 
stress [15]-[25]. 

N. Ganesan et al. (2010) used neural networks in the medical field for preclinical study. In this work 
the authors showed the various ways in which neural networks can be applied to clinical data for the 
diagnosis of lung cancer [16]-[23]. Sapna (2016) proposed a fusion of big data and neural networks for 
predicting thyroid disease. Clinical information is huge in volume, so conventional data processing 
applications are not sufficient to interpret it; innovative techniques are needed to handle it and extract 
important information from it [17]. Fengying Xie et al. (2017) built a novel technique for classifying 
tumors as benign or malignant by analyzing images. They designed an ensemble classifier that combines a 
backpropagation neural network with a fuzzy neural network [18]. De Araujo et al. (2017) observed that 
classical methods for induction motor fault diagnosis do not always provide satisfactory results. The authors 
propose a hybrid system that uses data obtained from vibration and current sensors to predict failures at an 
early stage [20]. The input to the fuzzy-logic-based system is obtained by processing the signals in the 
frequency and time domains through the short-time Fourier transform and multiresolution analysis [21-22]. The 
technique increases the reliability of detecting and diagnosing the level of severity compared 
to existing techniques [19]-[24]. 


3. THYROID PREDICTION SYSTEM USING MACHINE LEARNING 

In order to address the major research gaps, there is a need to design an improved thyroid disease 
pattern classifier by including additional features such as age group, heredity and antibodies. The blood test 
alone is a poor and crude method of determining whether a person is suffering from thyroid disease, so a 
better solution is to take more parameters into consideration. Moreover, the system may 
utilize better classifiers in order to improve the overall accuracy of the diagnostic system. The improved 
thyroid system must use the latest machine learning techniques to train and then test the machine learning 
model. This section presents the detailed framework and algorithms. 


3.1. Proposed Framework 

The training and testing phases of the thyroid disease prediction system are shown in 
Figure 3. The first step is to identify the typical parameters or risk factors that are 
responsible for thyroid disease in human beings. In the next step, a mixed dataset of 
patients from different categories is collected. In conventional thyroid diagnosis systems, 
the majority of authors have used only three factors, namely T3, T4 and TSH. In the proposed diagnostic 
prediction system, a larger number of risk factors can be included. 








[Figure 3: flowchart; recoverable labels: "Choose Random Samples", "Patient", "Thyroid Patient", "Training Phase", "Testing Phase"]

Figure 3. Framework for the proposed system 


In order to classify a particular patient into one of the three classes, a dataset of 120 samples was 
obtained and preprocessed. The dataset was manually curated to remove anomalies and noise and to quantify 
Boolean values. Once the dataset is prepared, a multilayer pattern classifier model is created and trained 
on it. The trained MLP pattern classifier model is stored for the testing phase. In order to check the 
accuracy of the thyroid predictions, a sample of randomly chosen patients is applied to the stored MLP 
prediction system. 
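
As a rough illustration of the preprocessing step described above, the sketch below encodes Boolean attributes as 0/1 and rescales a numeric attribute to the range [0, 1]. The attribute names (`heredity`, `tsh`), the sample values and the choice of min-max scaling are assumptions made purely for illustration; the paper does not state how the values were quantified.

```python
import numpy as np

def encode_boolean(values):
    """Map yes/no style answers to 1.0/0.0 (hypothetical encoding)."""
    truthy = {"yes", "y", "true", "1"}
    return np.array([1.0 if str(v).strip().lower() in truthy else 0.0
                     for v in values])

def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    column = np.asarray(column, dtype=float)
    span = column.max() - column.min()
    if span == 0:
        # A constant column carries no information; map it to zeros.
        return np.zeros_like(column)
    return (column - column.min()) / span

# Made-up example values, not taken from the SKIMS dataset.
heredity = encode_boolean(["Yes", "no", "YES"])
tsh = min_max_scale([0.5, 2.5, 4.5])
```

Scaling every attribute to a common range keeps a single large-valued attribute from dominating the weighted sums computed by the network.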


3.2. Proposed Algorithm 

The MLP is one of the most common ANNs and is widely used for tasks such as pattern 
classification and pattern recognition. One of its most important features is that any 
number of output classes can be specified. The network architecture chosen for this problem is an MLP with 
eleven input nodes and three output nodes. Each node in the input layer is connected to every node in the 
hidden layer through weights. The weighted input sum arriving at a particular node may be large, 
so it is important to scale it down before producing the output 
of that node. For this purpose an activation function is applied to the weighted input. One of the best 
training methods is backpropagation learning, which works on the principle of the gradient descent rule. The 
steps of the training algorithm are explained in the following section. The multilayer perceptron is 
trained with 11 nodes in the input layer of the network. 
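
To make the architecture concrete, here is a minimal sketch of one forward pass through a network with eleven inputs and three outputs. The logistic sigmoid is assumed as the squashing function and the hidden layer size of 8 is an arbitrary placeholder, since the paper specifies neither; the input record is random synthetic data.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 11, 8, 3   # hidden size 8 is an assumption

V = rng.normal(scale=0.5, size=(n_inputs, n_hidden))    # input-to-hidden weights
v0 = np.zeros(n_hidden)                                  # hidden biases
W = rng.normal(scale=0.5, size=(n_hidden, n_outputs))   # hidden-to-output weights
w0 = np.zeros(n_outputs)                                 # output biases

x = rng.random(n_inputs)     # one synthetic patient record with 11 attributes
z = sigmoid(v0 + x @ V)      # hidden activations
o = sigmoid(w0 + z @ W)      # one score per class
```

However large the weighted sum becomes, each activation stays strictly between 0 and 1, which is the scaling-down effect described above.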
3.2.1 The Pseudo Code for Training the Thyroid Prediction System Using MLP 


In the steps below, n, p and m denote the numbers of input, hidden and output nodes, and α is the learning rate.

1. Initialize the weights and the learning rate α. 

2. Perform steps 3 to 10 while the stopping condition is false. 

3. Repeat steps 4 to 9 for each training pair. 

4. Each input node x_i receives an input signal and passes it to the nodes in the hidden layer. 

5. Each node z_j in the hidden layer sums its weighted inputs to calculate its net input (feed-forward, Phase I), as 
shown in eq. 1: 

$z_{inj} = v_{0j} + \sum_{i=1}^{n} x_i v_{ij}$    (1) 

The activation function is then applied to $z_{inj}$ to calculate the output of the hidden node: 

$z_j = f(z_{inj})$    (2) 

This output signal is then sent from the hidden node as input to the output-layer nodes. 

6. For each output node o_k, calculate the total input as shown in eq. 3: 

$o_{ink} = w_{0k} + \sum_{j=1}^{p} z_j w_{jk}$    (3) 

Now apply the activation function to $o_{ink}$ to compute the output signal, as in eq. 4: 

$o_k = f(o_{ink})$    (4) 

Back-propagation learning rule (Phase II): 

7. Each output node receives the target pattern t_k associated with the input training vector and computes the error 
term using eq. 5: 

$\delta_k = (t_k - o_k) f'(o_{ink})$    (5) 

On the basis of the calculated error, compute the weight correction as given in eq. 6: 

$\Delta w_{jk} = \alpha \, \delta_k \, z_j$    (6) 

Send $\delta_k$ back to the hidden layer. 

8. Each hidden node sums the delta inputs received from the output nodes using eq. 7: 

$\delta_{inj} = \sum_{k=1}^{m} \delta_k w_{jk}$    (7) 

The error term of the hidden node is calculated as per eq. 8: 

$\delta_j = \delta_{inj} \, f'(z_{inj})$    (8) 

Adjust weights and biases (Phase III): 

9. Each output and hidden node updates its bias and weights as: 

$w_{jk}(\text{new}) = w_{jk}(\text{old}) + \Delta w_{jk}$    (9) 

$w_{0k}(\text{new}) = w_{0k}(\text{old}) + \Delta w_{0k}$    (10) 

10. Check the stopping condition, i.e. whether the actual output equals the target output. 
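
The training steps above can be sketched in NumPy as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the sigmoid activation, the hidden layer size, the learning rate, the fixed epoch count used as stopping condition, and the synthetic data are all placeholders. The hidden-layer updates for `V` and `v0` follow the standard rule (learning rate times `delta_j` times the input), which the pseudo code implies in step 9 but does not write out.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, V, v0, W, w0):
    """Feed-forward pass (eqs. 1-4); works for one record or a batch."""
    z = sigmoid(v0 + x @ V)
    o = sigmoid(w0 + z @ W)
    return z, o

def train_mlp(X, T, n_hidden=8, alpha=0.5, epochs=200, seed=0):
    """Per-pattern gradient-descent backpropagation, one hidden layer."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    V = rng.uniform(-0.5, 0.5, (n_in, n_hidden))
    v0 = np.zeros(n_hidden)
    W = rng.uniform(-0.5, 0.5, (n_hidden, n_out))
    w0 = np.zeros(n_out)
    for _ in range(epochs):              # assumed stopping rule: fixed epochs
        for x, t in zip(X, T):
            z, o = forward(x, V, v0, W, w0)
            # eq. 5, using f'(s) = f(s) * (1 - f(s)) for the sigmoid
            delta_k = (t - o) * o * (1.0 - o)
            # eqs. 7-8, computed before W is modified
            delta_j = (W @ delta_k) * z * (1.0 - z)
            W += alpha * np.outer(z, delta_k)    # eqs. 6 and 9
            w0 += alpha * delta_k                # eq. 10
            V += alpha * np.outer(x, delta_j)    # hidden-layer counterpart
            v0 += alpha * delta_j
    return V, v0, W, w0

# Synthetic stand-in data: 30 records, 11 attributes, 3 one-hot classes.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1, 2], 10)
T = np.eye(3)[labels]
X = rng.random((30, 11)) * 0.1
X[np.arange(30), labels] += 1.0          # make the classes easy to separate

weights = train_mlp(X, T)
```

Updating the weights after every pattern matches the per-pair loop of step 3; a batch variant would accumulate the corrections over all pairs before applying them.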





3.2.2 Testing Algorithm for the Pattern Classification 


1. Repeat steps 2 to 4 for each input vector. 

2. Set the activation of each input unit x_i. 

3. At each hidden node z_j, calculate the net input and output as shown in eq. 11 and eq. 12 respectively: 

$z_{inj} = v_{0j} + \sum_{i=1}^{n} x_i v_{ij}$    (11) 

$z_j = f(z_{inj})$    (12) 

4. At each output node, compute the output as given in eq. 13 and eq. 14: 

$o_{ink} = w_{0k} + \sum_{j=1}^{p} z_j w_{jk}$    (13) 

$o_k = f(o_{ink})$    (14) 
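
The testing algorithm amounts to a single forward pass followed by picking the output node with the largest activation. The sketch below assumes a sigmoid activation, an output ordering of normal, hyperthyroid, hypothyroid, and a tiny hand-weighted 3-3-3 network chosen only so the result is easy to trace; none of these details come from the paper.

```python
import numpy as np

CLASS_NAMES = ("normal", "hyperthyroid", "hypothyroid")  # assumed output order

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify(x, V, v0, W, w0):
    """Feed one feature vector forward and name the winning class."""
    z = sigmoid(v0 + x @ V)      # eqs. 11-12: hidden layer
    o = sigmoid(w0 + z @ W)      # eqs. 13-14: output layer
    return CLASS_NAMES[int(np.argmax(o))], o

# Hand-picked toy weights (hypothetical), not a trained model.
V = 4.0 * np.eye(3)
v0 = np.full(3, -2.0)
W = 4.0 * np.eye(3)
w0 = np.full(3, -2.0)

label, scores = classify(np.array([0.0, 1.0, 0.0]), V, v0, W, w0)
print(label)  # hyperthyroid
```

With a trained network the same function would be called with the stored weights and an 11-attribute patient record.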


4. RESULTS AND ANALYSIS 

The experiments were conducted on a real dataset of 120 instances collected from SKIMS Soura, 
Srinagar, India. The subjects were chosen carefully to cover a wide range of the population, including men 
and women, old and young. The values of eleven attributes were collected for all 120 instances, and 
some attributes were quantified and factorized. The dataset was preprocessed in order 
to remove ambiguities, anomalies and errors. The preprocessed dataset is used to train the pattern classifier 
model using the error backpropagation algorithm for the multilayer perceptron. The dataset was split 
into training and test instances. The machine learning model was trained with training sets of varying 
size and then tested on the test set for cross-validation. 
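
A random partition of the instances into training, validation and test subsets can be sketched as below. The 70/15/15 proportions and the helper name `split_indices` are assumptions for illustration; the paper reports training, validation and testing accuracies but does not state the split ratios.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle record indices and cut them into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]       # whatever remains goes to testing
    return train, val, test

# Subset sizes for the four dataset sizes used in the experiments.
for n in (30, 60, 90, 120):
    train, val, test = split_indices(n)
    print(n, len(train), len(val), len(test))
```

Fixing the seed makes the partition reproducible, which matters when accuracies are compared across dataset sizes.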
Figure 4 shows the best performance of the MLP, at epoch 9, on 30 instances with 7 attributes. 
Figure 5 shows the gradient error graph for 30 instances and 7 attributes. The confusion matrix in 
Figure 6 reveals the performance of the pattern classifier (MLP). The green cells in 
the confusion matrix represent correctly classified instances, whereas the red cells represent incorrect 
classifications. The blue box represents the overall percentages of correct and incorrect classifications. 
Figure 7 shows the ROC with 30 instances and 7 attributes, and Figure 8 shows the performance with 120 
instances and 7 attributes. 
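
For readers unfamiliar with the layout, the following sketch builds a three-class confusion matrix and derives the overall accuracy from it. The label arrays are invented for illustration and the class indices 0, 1, 2 are an assumed encoding of normal, hyperthyroid and hypothyroid; the values do not reproduce Figure 6.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes=3):
    """Rows index the true class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (true_labels, predicted_labels), 1)
    return cm

# Invented labels for six hypothetical patients.
true_labels = np.array([0, 0, 1, 1, 2, 2])
predicted_labels = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(true_labels, predicted_labels)
correct = int(np.trace(cm))              # diagonal = correctly classified
overall_accuracy = correct / cm.sum()    # here 4 of 6 records are correct
```

The diagonal entries correspond to the correctly classified cells, the off-diagonal entries to the misclassified ones, and `overall_accuracy` to the summary percentage.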


[Figure 4 plot title: Best Validation Performance is 0.16579 at epoch 9]
Figure 4. Performance with 30 instances and 7 attributes 

[Figure 5 plot title: Gradient = 0.079411, at epoch 15]
Figure 5. Gradient error with 30 instances and 7 attributes 

Figure 6. Confusion matrix with 30 instances and 7 attributes 

Figure 7. ROC with 30 instances and 7 attributes 

[Figure 8 plot title: Best Validation Performance is 0.035558 at epoch 15]
Figure 8. Performance with 120 instances and 7 attributes 


The proposed model of the diagnostic system has been evaluated with various numbers of training 
instances and features, as shown in Table 1. In the initial step, the system was trained with 30 instances 
and 11 attributes, which took only 9 iterations with a gradient error of 0.0314. Table 2 reveals that 
training the system with 30 samples results in a training accuracy of 85%, a testing accuracy of 40% and 
a poor overall accuracy of 66.7%. Training the proposed system with 7 attributes and 30 samples 
takes 14 iterations with a gradient error of 9.95e-07. Furthermore, the number of training instances was 
increased and the performance was evaluated at 60, 90 and 120 instances, 
as shown in Table 1 and Table 3. 
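
The overall accuracy can be read as the size-weighted average of the training, validation and testing accuracies. The short check below assumes that the 30 instances were divided into 20 training, 5 validation and 5 test records; this partition is an assumption (the paper does not state it), chosen because it is consistent with the percentages reported for 30 instances in Table 2 and Table 3.

```python
def overall_accuracy(subset_sizes, subset_accuracies):
    """Size-weighted average of per-subset accuracies."""
    correct = sum(n * a for n, a in zip(subset_sizes, subset_accuracies))
    return correct / sum(subset_sizes)

sizes = (20, 5, 5)   # assumed train/validation/test partition of 30 records

# 11 attributes: 85% training, 20% validation, 40% testing accuracy
acc_11 = overall_accuracy(sizes, (0.85, 0.20, 0.40))   # (17 + 1 + 2) / 30

# 7 attributes: 75% training, 100% validation, 20% testing accuracy
acc_7 = overall_accuracy(sizes, (0.75, 1.00, 0.20))    # (15 + 5 + 1) / 30

print(f"{acc_11:.1%} {acc_7:.1%}")
```

Under this assumed partition the weighted averages come out at about 66.7% and 70%, matching the overall accuracies reported for 30 instances.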
Table 1. Training Performance of the Thyroid System with 11 and 7 Features (epochs, gradient errors) 

                      Features = 11                              Features = 7
Attempt  Instances    Epochs  Network Trained  Gradient Error    Epochs  Network Trained  Gradient Error
1        30           9       Yes              0.0314            14      Yes              9.95e-07
2        60           25      Yes              0.00029           13      Yes              0.0118
3        90           34      Yes              0.000591          21      Yes              0.00161
4        120          56      Yes              9.23e-07          21      Yes              0.0103


Table 3 shows the performance of the proposed system with 7 attributes at different 
numbers of instances. With 120 instances, the overall accuracy of the proposed diagnostic system is ~100% 
with 11 attributes and 99.2% with 7 attributes. The results therefore suggest that, with enough training 
instances, the overall accuracy is largely independent of the number of attributes, at roughly 99.8% 
in both cases. Surprisingly, with a small number of training instances (e.g. 30), the overall 
accuracy is better with seven attributes (70%) than with eleven attributes (66.7%). The number of epochs 
required to train the model with eleven attributes increases with the number of samples. 

Table 4 compares the performance of the proposed thyroid diagnostic system with 
existing similar systems. The proposed model uses considerably more attributes 
than the other systems. I. D. Maysanjaya et al. (2015) used an MLP 
backpropagation pattern classifier for thyroid diagnosis based on 5 attributes with an accuracy of 
96.7%, which is outperformed by the proposed diagnosis system using 11 attributes with an overall accuracy of 
~100%. The proposed system also performs better than the similar systems proposed by Mazin 
Abdul Rasool (2009) and Saeed Shariati et al. (2010) in terms of prediction accuracy. 


Table 2. Performance of the Intelligent Thyroid Prediction System with 11 Features (training, validation, 
testing, accuracy) 

Iteration No.  Data Size  Performance  Training  Validation  Testing   Overall
                                       Accuracy  Accuracy    Accuracy  Accuracy
1              30         0.031        85%       20%         40%       66.7%
2              60         9.87e-05     ~100%     717% [sic]  ~100%     96.7%
3              90         0.000158     ~100%     ~100%       ~100%     ~100%
4              120        3.10e-07     ~100%     ~100%       ~100%     ~100%


Table 3. Performance of the Thyroid Prediction System with 7 Features (Accuracy) 

Iteration No.  Data Size  Performance  Training  Validation  Testing   Overall
                                       Accuracy  Accuracy    Accuracy  Accuracy
1              30         0.00209      75%       100%        20%       70%
2              60         0.0124       95.2%     100%        77.8%     93.3%
3              90         0.000663     100%      100%        92.9%     98.9%
4              120        0.00852      ~100%     94.4%       ~100%     99.2%


Table 4. Comparative Analysis of Various Pattern Classifiers for Thyroid Prediction System with Our System 

Author                                 Diagnosis              Features  Machine learning technique used  Performance
I. D. Maysanjaya et al. (2015)         Thyroid                5         MLP (backpropagation)            Accuracy = 96.7%
Mazin Abdul Rasool (2009)              Thyroid                3         MLP                              --
Saeed Shariati, Mahdi Motavali (2010)  Hepatitis and thyroid  --        Self-organized fuzzy system      --
Proposed thyroid prediction system     Thyroid                7 and 11  MLP (backpropagation)            Accuracy with 7 features = 99.2%;
(2018)                                                                                                   accuracy with 11 features = 99.8%


5. CONCLUSION 

Intelligent prediction and classification has been achieved by training the MLP model with a 
mixed dataset of subjects living in different habitats. The proposed model 
was successfully tested with random samples including instances of hyperthyroid, hypothyroid and normal 
individuals. The system exhibits excellent training and testing accuracy of almost 100% with 11 attributes 
and 99.2% with 7 attributes. The proposed thyroid prediction system exhibits a better prediction accuracy, of 
nearly 99.8% with 7 or 11 features, than existing similar systems, which used only 3 to 5 features. 
Moreover, a comparison has been carried out between the two experiments conducted with different numbers of 
decision attributes. As future research, ensemble techniques such as random forests, bagging and boosting 
may be used to improve the accuracy. Furthermore, the proposed 
machine learning model may be extended to diagnose other lifestyle diseases such as diabetes, blood 
pressure and many more. Deep learning techniques such as CNNs may be used to incorporate further 
intelligence into the proposed system. 